Abstract


This document describes a recommended syntax for writing the string
representation of unit labels (‘VOUnits’). In addition, it describes a set of
recognised and deprecated units, which is as far as possible consistent with
other relevant standards (BIPM, ISO/IEC and the IAU).

The intention is that units written to conform to this specification will 
likely also be parsable by other well-known parsers. To this end, we include 
machine-readable grammars for other units syntaxes. 


Status of this document 

This document has been produced by the IVOA Semantics Working Group. 
It has been reviewed by IVOA Members and other interested parties, and 
has been endorsed by the IVOA Executive Committee as an IVOA
Recommendation. It is a stable document and may be used as reference material
or cited as a normative reference from another document. IVOA’s role in 
making the Recommendation is to draw attention to the specification and 
to promote its widespread deployment. This enhances the functionality and 
interoperability inside the Astronomical Community. 

The place for discussions related to this document is the Semantics IVOA 
mailing list semantics@ivoa.net. 

A list of current IVOA recommendations and other technical documents 
can be found at http://www.ivoa.net/Documents/. 

Note on conformance 

Text within the following document is classified as either ‘normative’ or 
‘informative’. 

Normative text means information that is required to implement the 
Recommendation; an implementation of this Recommendation is conformant 
if it abides by all the prescriptions contained in normative text. Informative 
text is information provided to clarify or illustrate a requirement but which 
is not required for conformance. 

The sections and subsections of this Recommendation are labeled, after 
the section heading, to specify whether they are normative or informative. If 
a subsection is not labeled, it has the same normativity as its parent section. 
References are normative if they are referred to within normative text. 

When found within normative sections, the key words must, must not, 
required, shall, shall not, should, should not, recommended, may, 
optional, thus formatted, are to be interpreted as described in RFC 2119 
(?).


1 Introduction (informative)

This document describes a standardised use of units in the VO (hereafter
simply ‘VOUnits’). It aims to describe a syntax for unit strings which is as
far as possible in the intersection of existing syntaxes, and to list a set of
‘known units’ which is the union of the ‘known units’ of those standards. We
recommend, therefore, that applications which write out units should do so
using only the VOUnits syntax, and that applications reading units should
be able to read at least the VOUnits syntax, plus all of the units of Sect. 2.4.
It is not, however, quite possible for VOUnits to be in the intersection of
existing syntaxes; there is further discussion of this point in Sect. 2.12.1.

We also provide, for information, a set of self- and mutually-consistent
machine-readable grammars for all of the syntaxes discussed.

The introduction gives the motivation for this proposal in the context
of the VO architecture, from the legacy metadata available in the resource
layer, to the requirements of the various VO protocols and standards and
applications.

This document is organised as follows. Sect. 2 details the proposal for
VOUnits. Sect. 3 lists some use cases and reference implementations. In
Appx. A, there is a brief review of current practices in the description and
usage of units; in Appx. B there is a detailed discussion of the differences
between the various syntaxes; and in Appx. C there are formal (yacc-style)
grammars for the four syntaxes discussed.

The normative content of this document is Sect. 2 and Appx. C.4.

1.1 Units in the VO Architecture 

Generally, every quantity provided in astronomy has a unit attached to its
value or is unitless (e.g., a ratio, or a numerical multiplier).

Units lie at the core of the VO architecture, as can be seen in Fig. 1.
Most of the existing data and metadata collections accessible in the resource
layer have some legacy units, which are mandatory for any scientific use of
the corresponding data. Units can be embedded in data (e.g., FITS headers)
or be implied by convention and/or (preferably) specified in metadata.

Figure 1: Units is a core building block in the VO. Most parts of the
architecture rely on it: the User Layer with tools and clients, the Resource
Layer with data. Protocols, registry entries, and data models also re-use
these Units definitions.

Units also appear in the VOTable format (?), through the use of a unit 
attribute that can be used in the FIELD, PARAM and INFO elements. Because 
of the widespread dependency of many other VO standards on VOTable, 
these standards inherit a dependency on Units. 

The Units also appear in many Data Models, through the use of dedicated 
elements in the models and schemas. At present, each VO standard either 
refers to some external reference document, or provides explicit examples of 
the Units to be used in its scope, on a case-by-case basis. 

The registry records can also contain units, for the description of table
metadata. The definition of VO Data Access protocols uses units by
specifying in which units the input parameters have to be expressed, or by
restricting the possible units in which some output must be returned.

And last but not least, tools can interpret units, for example to display
heterogeneous data in a single diagram by applying conversions to a reference
unit on each axis.

1.2 Adopted terms and notations

Discussions about units often suffer from misunderstandings arising from 
cultural differences or ambiguities in the adopted vocabulary. For the sake 
of clarity, in this document, the following concepts are used: 

A quantity is the combination of a (numerical) value, measured for a
concept and expressed in terms of a given unit; there may be other structure
to a quantity, such as uncertainty or even provenance. In the VO context,
the nature of the concept can be expressed with a UCD or a utype. This
document does not address the full issue of representing quantities, but
focusses on the unit part.

A unit can be expressed in various forms: in natural language (e.g., metres
per second squared), with a combination of symbols with typographic
conventions (e.g., m s⁻²), or by a simplified text label (e.g., m.s-2). VOUnit
deals with the label form, which is easier to standardize, parse and exchange.
A VOUnit corresponds in the most general case to a combination of several
(possibly prefixed) symbols with mathematical operations expressed in a
controlled syntax.

A unit consists of a sequence of unit components, each of which
represents a base unit, possibly modified by a multiplicative prefix (of one or
two characters), and raised to an integer or rational power. The whole unit
may (in some syntaxes) be prefixed by a numerical scale-factor.

Each of the base units (for example, the metre) is represented by a 
base symbol (for example m). Each syntax has a number of known units 
(Sect. 2.4), for each one of which there is at least one symbol which identifies 
only that unit. 

A symbol is either a base symbol or a base symbol with a scaling prefix. 

For example, in the unit 1.663e-1mm.s**-1, the scale-factor is 1.663 ×
10⁻¹, the two unit-components are mm and s**-1; the first symbol has base
symbol m and prefix m (for ‘milli’), and the second has base symbol s, no
prefix, and the power −1.
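
The decomposition in this example can be written down as a small data structure. The following Python sketch is illustrative only: the class and field names (`UnitComponent`, `Unit`, `scale_factor` and so on) are assumptions of the sketch and are not defined anywhere in this document, and the object is built by hand rather than by a parser.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List, Optional


@dataclass
class UnitComponent:
    """One unit component: optional prefix, base symbol, and power (hypothetical type)."""
    base: str               # base symbol, e.g. "m" for metre
    prefix: Optional[str]   # multiplicative prefix, e.g. "m" for milli, or None
    power: Fraction         # integer or rational power


@dataclass
class Unit:
    """A whole unit: numerical scale-factor plus a sequence of components."""
    scale_factor: float
    components: List[UnitComponent]


# Hand-built representation of the string 1.663e-1mm.s**-1
example = Unit(
    scale_factor=1.663e-1,
    components=[
        UnitComponent(base="m", prefix="m", power=Fraction(1)),    # mm
        UnitComponent(base="s", prefix=None, power=Fraction(-1)),  # s**-1
    ],
)
```

Using `Fraction` for the power is one way to accommodate both integer and rational powers without rounding.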

1.3 Purpose of this document 

The purpose of this document is to provide a reference specification of how 
to write VOUnits, in order to maximize interoperability within the VO; the 
intention is that VOUnit strings should be reliably parsable by humans and 
computers, with a single interpretation. This is broadly the case for the other 
existing unit-string syntaxes, although there are some slight ambiguities in 
the specifications of these syntaxes (cf Appx. C). We therefore include a set 
of self- and mutually-consistent machine-readable grammars for all of the 
syntaxes discussed. 

The unit syntax(es) described here are intended to be human-readable,
to the extent that, for example, a string such as mm.s**-2 is human-readable
(without this restriction, we could easily define a much more regular
machine-to-machine grammar). Having an explicit unit-string grammar means that
data providers can write human-readable strings in the confidence that the
result will additionally be machine-readable in a reliable and checkable way.
Where a string is not fully machine readable (because a data provider
needs to use a custom unit such as 'JupiterMass'; see Sect. 2.11), the
grammar still means that the string is at least partially machine readable,
and that that partial readability is unambiguous.

We aim not to reinvent the wheel, and to be as compliant as possible 
with legacy metadata in major archives, and astronomers’ habits. 

In particular: 

• We describe (Appx. A) a number of existing unit syntaxes, and mention 
some ambiguities in their definition. Application authors should expect 
to encounter each of the syntaxes mentioned in this document (FITS, 
OGIP and CDS); all of these are broadly endorsed by this specification. 

• In addition to the unit syntaxes described above, there are multiple
specifications of base and known units (we refer, in particular, to
specifications from BIPM, ISO/IEC and the IAU); these are broadly, but
not completely, mutually consistent.

• Where there are some ambiguities in, or contradictions between, these 
various specifications, we recommend that application authors should 
resolve them as indicated in this specification. 

• This document defines a syntax, called ‘VOUnits’, which is as far as 
is feasible in the intersection of the three existing syntaxes, and which 
we recommend that applications should use when writing unit strings. 
This aim is not quite possible in fact, and the extensions to it, and the 
mild deviations from it, are discussed below in Sect. 2 and Appx. C; 
there is a summary of the various units in Table 2 on page 14. 

1.4 What this document will not do 

This Recommendation does not prescribe what units data providers employ, 
except to the extent that we avoid giving a standard interpretation for a unit 
in some cases (for example we do not acknowledge the degree Celsius or the 
century as units). Since we do not forbid ‘unrecognised’ units, this need not 
restrict data providers. Nor do we demand that a given quantity be expressed 
in a unique way (e.g., all distances in m). So long as data is labelled in a 
recognised system, a translation layer can be provided. Data providers can 
customise the translation tools if required. Depending on preference and 
the operations required, the user may have a choice of units for his or her 
query and for the result. In particular, the Recommendation does not require 
that only recognised units are used. While it is obviously desirable for data 
providers to use recognised and non-deprecated units where possible, there 
are occasions when this is unnecessary or undesirable.

This Recommendation does not discuss quantities at all. That is, we do
not discuss the combination of number and unit which refers to a particular
physical measurement, such as ‘2 m s⁻¹’. Though this might appear to be a
trivial extension, it raises questions of the representation of decimal numbers, 
the representation of uncertainties, questions of unit conversion, and other 
data-modelling imponderables which have in the past, possibly surprisingly, 
generated a great deal of discussion within the IVOA without, so far, a 
generally acceptable resolution. 

This Recommendation describes only isolated units, and not arrays,
records or other combinations of units. Several VO protocols require
embedding complex objects into result tables, and give string serializations for
those: geometries in TAP results are the most common example. This
specification does not cover this situation, although we hope that where individual
unit strings are required in such instances, their syntax will conform to, or
include, this specification by reference.

In general, this Recommendation is concerned almost exclusively with 
the syntactic question of what is and is not a valid unit string, leaving most 
questions of interpretation or enforcement to a higher layer in an application 
stack. Specifically: 

• The specification does not forbid ‘unknown’ units. An implementation 
of this specification should be able to recognise, and communicate, that 
a unit is unknown, but it is not required to reject a unit string on the 
grounds that it is unrecognised. 

• Similarly, although Table 2 on page 14 forbids some units from having 
SI prefixes, a VOUnit implementation should not itself reject a unit 
string which incorrectly includes a prefix, but should instead just make 
available the information that this has been detected, and that it is 
deprecated. 

• The list of known units in Sect. 2.4 is not specific about the precise 
definitions of the units in question; for example, it refers to the ‘second’ 
without distinguishing between the various possible definitions that the 
second may have. See that section for further discussion. 

• This Recommendation does not specify how an application should
compare units for equivalence; for example, an application may or may not
wish to deem m/s and km/s to be ‘equivalent’. This Recommendation,
similarly, does not specify how to compare units with scale-factors (cf
Sect. 2.6).


2 The VOUnits syntax (normative) 

The rules for VOUnits are defined in this section. Various aspects are
addressed:

• how the labels are encoded;

• what base symbols are allowed and how they are spelled;

• what prefixes are allowed and how they are used;

• how symbols are combined.

A formal grammar summarizing these conventions is given in Appx. C.4. 

The text below is expected to be compatible with the prescriptions of 
the SI standard (?), except where noted. 

2.1 String representation and encoding 

VOUnits may occur in legacy contexts, in which the presence of non-ASCII
characters may cause considerable technical inconvenience (for example FITS
cards). There are only a few non-ASCII characters which we might wish to
include in unit strings (for example Å or µ), and we can find substitutes for
these sufficiently easily, that we feel there is little real benefit in permitting
non-ASCII characters in VOUnit strings.

All the VOUnit characters in the specification below are printable ASCII 
characters (that is, in the range hexadecimal 20 to 7E); any extensions to 
this standard should be restricted to this same range. 
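
The character-range restriction is easy to check mechanically. The following Python sketch shows one way to do so; the function name `is_printable_ascii` is an assumption of this sketch, not something defined by this specification.

```python
def is_printable_ascii(unit_string: str) -> bool:
    """Return True if every character lies in the range hexadecimal 20 to 7E."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in unit_string)
```

For example, `is_printable_ascii("m.s-2")` is true, while a string containing Å or µ fails the check and would need the ASCII substitutes (`Angstrom`, `u`) described later in this section.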

All VOUnit strings must be regarded as case-sensitive (the strings in the 
other syntaxes are also case-sensitive). 

2.2 Parsing unit strings — overview 

The unit strings unknown and UNKNOWN (that is, in all-lowercase or all-uppercase)
are reserved for cases when the unit is unknown; that is, it is known that
there should be a unit, but the unit string has been lost or not been
specified. These strings are not, however, part of the list of known units or the
VOUnits grammar, and applications must check for their presence before
unit parsing.

An empty unit string positively indicates that the corresponding quantity 
is dimensionless. Since an empty string does not conform to the grammars 
below, this also must be checked for before unit-parsing starts. 
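
The two checks that precede parsing can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical `precheck` function and `PreCheck` enumeration; neither name comes from this document.

```python
from enum import Enum


class PreCheck(Enum):
    UNKNOWN = "unknown"              # reserved string: a unit exists but was lost
    DIMENSIONLESS = "dimensionless"  # empty string: positively dimensionless
    NEEDS_PARSING = "needs parsing"  # anything else goes to the grammar


def precheck(unit_string: str) -> PreCheck:
    """Classify a unit string before it is handed to a VOUnits parser."""
    # Only the all-lowercase and all-uppercase spellings are reserved.
    if unit_string in ("unknown", "UNKNOWN"):
        return PreCheck.UNKNOWN
    # Only the truly empty string is tested here; blanks are not treated as empty.
    if unit_string == "":
        return PreCheck.DIMENSIONLESS
    return PreCheck.NEEDS_PARSING
```

Note that a mixed-case string such as `Unknown` is not reserved, so in this sketch it is passed on to the parser like any other string.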

A symbol within a unit-component should be parsed as follows: 

1. If it corresponds to a known base symbol, then it must be recognised
as such (for example the Pa must be parsed as the known pascal, and
never as the peta-year).

2. If the symbol starts with a multiplicative prefix, then this is recognised
independently of whether the resulting base symbol is a known or
unknown unit - thus Mm and Mfurlong are parsed as millions of metres
and furlongs, but note that this implies, for the sake of consistency,
that furlong is parsed as the femto-‘urlong’.

3. In the VOUnits syntax (a significant divergence from the other
syntaxes), base symbols may be put between single quotes '...' (ASCII
character hexadecimal 27). Such symbols must be parsed as unrecognised unit
symbols which are not further examined. See Sect. 2.11 for discussion.
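
The three rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the names `resolve_symbol` and `Symbol` are hypothetical, the table of known symbols is heavily abbreviated, and the two-character prefix `da` is left out.

```python
from typing import NamedTuple, Optional

# Abbreviated for illustration only; the full sets are in Tables 2 and 3.
KNOWN_SYMBOLS = {"m", "s", "g", "Pa", "mol", "cd", "d", "a", "yr"}
# Single-character decimal prefixes; "da" is deliberately omitted from this sketch.
SI_PREFIXES = set("dhkMGTPEZYcmunpfazy")


class Symbol(NamedTuple):
    prefix: Optional[str]
    base: str
    known: bool
    quoted: bool


def resolve_symbol(text: str) -> Symbol:
    # Rule 3: a quoted symbol is an unrecognised unit and is not examined further.
    if len(text) > 2 and text[0] == "'" and text[-1] == "'":
        return Symbol(None, text[1:-1], known=False, quoted=True)
    # Rule 1: a known symbol takes priority over any prefix reading.
    if text in KNOWN_SYMBOLS:
        return Symbol(None, text, known=True, quoted=False)
    # Rule 2: otherwise a leading prefix is recognised, whatever follows it.
    if len(text) > 1 and text[0] in SI_PREFIXES:
        base = text[1:]
        return Symbol(text[0], base, known=base in KNOWN_SYMBOLS, quoted=False)
    # No prefix applies: the whole symbol is an unknown unit.
    return Symbol(None, text, known=False, quoted=False)
```

With these tables, `Pa` resolves to the known pascal, `Mm` to mega plus the known metre, `furlong` to femto plus the unknown `urlong`, and `'furlong'` to the unexamined unknown unit `furlong`.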

A library which implements this specification should be able to
distinguish known and unknown units, and identify deviations from the restrictions
on their use, below. It should be able to communicate such information to a
caller, but it should not unilaterally reject unit strings which use unknown
units or use known units in disapproved ways (of course, a higher-level
application is free to reject unit strings for any reason it pleases).

2.3 Base units 

There is good agreement for the base symbols across the different schemes 
(see Table 10 on page 28). 

The VOUnits base symbols are listed in Table 1.

m    (metre)      g    (gram)        J   (joule)      Wb   (weber)
s    (second)     rad  (radian)      W   (watt)       T    (tesla)
A    (ampere)     sr   (steradian)   C   (coulomb)    H    (henry)
K    (kelvin)     Hz   (hertz)       V   (volt)       lm   (lumen)
mol  (mole)       N    (newton)      S   (siemens)    lx   (lux)
cd   (candela)    Pa   (pascal)      F   (farad)      Ohm  (ohm)

Table 1: VOUnits base units

For masses, the SI unit is kg. However, existing specifications recommend 
not using scale-factors with kg, but attaching them only to g instead. 

Recognising a known unit takes priority over parsing for prefixes. Thus
the string Pa represents the pascal, and not the peta-year, and the string mol
will always be the mole, and never a milli-‘ol’, for some unknown unit ‘ol’.

2.4 Known units 

In Table 2 on page 14, we indicate the ‘known units’ for each of the described 
syntaxes, which go beyond the physically motivated set of base units. There 
are a few units (namely angstrom or Angstrom, pix or pixel, ph or photon 
and a or yr) for which there are recognised alternatives in some syntaxes, 
and in these cases ‘p’ marks the preferred one. 

This list of known units is not specific about the precise definitions of
the units in question; for example, it refers to the ‘second’ without
distinguishing between the various possible definitions that the second may have:
they may be mean-solar or atomic seconds, and be defined at different points
in spacetime. Generally, when data is exchanged in those areas where such
distinctions matter - such as data connected with pulsar timings - the fine
semantic details will be indicated by the data provider through other
mechanisms. That said, a VOUnits processor must interpret the symbols of Table 2
on the following page compatibly with the indicated units: a m is always a
metre of one type or another, and may not be interpreted as, for example, a
minute.

Unrecognised units should be accepted by parsers, as long as they are
parsed giving preference to the syntaxes and prefixes described here. Thus,
for example, the string furlong/week should parse successfully (though
perhaps with suitably prominent warnings) as the femto-‘urlong’ per week.

The Unity library (Sect. 3.2) recognises units with respect to a subset of
the QUDT unit framework ?, with some astronomy-specific additions. This
is a particularly comprehensive collection of units, and we commend it to
the IVOA community as a lingua franca for this type of work.

Sections 2.5 to 2.8 below, discussing the set of known units, are longer
than one might expect would be necessary. Most of the discussion concerns
rather arcane edge-cases, or attempts to reconcile the minor deviations
between the relevant existing standards. In all cases, we have attempted to be
as uninnovative and unsurprising as possible.

Future versions of this specification may add to the set of known units.

2.5 Binary units 

The symbol ‘b’ is sometimes used for ‘bits’, but this is the SI symbol for 
‘barn’, and this Recommendation aligns with the SI standard in this respect. 
Since the same symbol is sometimes used for ‘bytes’, it is probably best 
avoided in any case. 

?, item 13-9.c notes that the term ‘byte’ ‘has been used for numbers of bits
other than eight’ in the past, but that it should now always be used for
eight-bit bytes; we recommend the same interpretation here. The same source
notes the theoretical confusion between the symbol B for ‘byte’ and for ‘Bel’.
We believe it would be perverse in our present context to recommend against
using ‘B’ for byte, and resolve this here in favour of ‘byte’ by mandating
that B must be parsed as indicating the ‘byte’, that the dB is an unprefixable
special-case unit (as discussed below), and by implication that the ‘dB’ must
not be interpreted as a tenth of a byte.¹

2.6 Scale factors 

Units may be prefixed by any of the 20 SI scale-factors, and a subset may
be prefixed by the eight binary scale-factors. The SI scale-factors - provided
in Table 3a - are the same as those of ?, of ?, §6.5.4, and of ?, Table 5 (see
also Table 11 on page 29 for further comparisons).

¹ We have no evidence that this has been a common source of confusion within the
IVOA, or indeed anywhere else.

symbol    description         fits  ogip  cds   vou
%         percent
A         ampere              s     s     s     s
a         Julian year         s           s     s
adu       ADU                                   s
Angstrom  angstrom            d                 dp
angstrom  angstrom                              d
arcmin    arc minute                            s
arcsec    arc second                      s     s
AU        astronomical unit                     p
au        astronomical unit
Ba        besselian year                        d
barn      barn                sd          s     sd
beam      beam                                  s
bin       bin                                   s
bit       bit                 s           s     sb
byte      byte                s           s     sbp
B         byte                                  sb
C         coulomb             s     s     s     s
cd        candela             s     s     s     s
chan      channel                               s
count     number                                sp
Crab      crab                      s
ct        number                                s
cy        Julian century
d         day                                   s
dB        decibel
D         debye                                 s
deg       degree (angle)                        s
erg       erg                 d                 sd
eV        electron volt       s     s     s     s
F         farad               s     s     s     s
g         gramme              s     s     s     s
G         gauss               sd                sd
H         henry               s     s     s     s
h         hour                                  s
Hz        hertz               s     s     s     s
J         Joule               s     s     s     s
Jy        Jansky              s     s     s     s
K         kelvin              s     s     s     s
lm        lumen               s     s     s     s
lx        lux                 s     s     s     s
lyr       light year                            s
m         meter               s     s     s     s
mag       magnitudes          s           s     s
mas       milliarcsecond
min       minute (time)                         s
mol       mole                s     s     s     s
N         newton              s     s     s     s
Ohm       ohm                 s           s     s
ohm       ohm                       s
Pa        pascal              s     s     s     s
pc        parsec              s     s     s     s
ph        photon                                s
photon    photon              p                 sp
pix       pixel                                 s
pixel     pixel               p                 sp
R         rayleigh            s                 s
rad       radian              s     s     s     s
Ry        rydberg                         s     s
s         second (time)       s     s     s     s
S         siemens             s     s     s     s
solLum    solar luminosity                      s
solMass   solar mass                            s
solRad    solar radius                          s
sr        steradian           s     s     s     s
T         tesla               s     s     s     s
ta        year tropical                         d
u         AMU                                   s
V         volt                s     s     s     s
voxel     voxel                                 s
W         watt                s     s     s     s
Wb        weber               s     s     s     s
yr        Julian year         sp          sp    sp

[Cells that held only the plain ‘recognised’ mark in the original table cannot
be told apart from empty cells in this text version; both appear blank above.]

Table 2: Known units in the various syntaxes. In the table, and for a given
syntax, a ‘·’ indicates that the unit is recognised, an ‘s’ that it is additionally
permitted to have SI prefixes, a ‘b’ that binary prefixes will be recognised,
and a ‘d’ that it is recognised but deprecated. For those units which have
alternative symbols for a given unit, a ‘p’ indicates the one preferred for
output.

da  deca,  10¹     d  deci,  10⁻¹        Ki  kibi, 2¹⁰
h   hecto, 10²     c  centi, 10⁻²        Mi  mebi, 2²⁰
k   kilo,  10³     m  milli, 10⁻³        Gi  gibi, 2³⁰
M   mega,  10⁶     u  micro, 10⁻⁶        Ti  tebi, 2⁴⁰
G   giga,  10⁹     n  nano,  10⁻⁹        Pi  pebi, 2⁵⁰
T   tera,  10¹²    p  pico,  10⁻¹²       Ei  exbi, 2⁶⁰
P   peta,  10¹⁵    f  femto, 10⁻¹⁵       Zi  zebi, 2⁷⁰
E   exa,   10¹⁸    a  atto,  10⁻¹⁸       Yi  yobi, 2⁸⁰
Z   zetta, 10²¹    z  zepto, 10⁻²¹
Y   yotta, 10²⁴    y  yocto, 10⁻²⁴

Table 3: VOUnits prefixes: (a, left) decimal prefixes; (b, right) binary
prefixes

Writers of unit strings must not use compound prefixes (that is, more 
than one SI prefix). Prefixes are concatenated to the base symbol without 
space, and must not be used without a base symbol. 

The SI prefixes of Table 3a must always refer to multiples of 1000, even
when applied to binary units such as bit or byte; this follows the stipulations
(and clarifying note) of ?, §3.1, and of ?, §6.5.4. If data providers wish to
use multiples of 1024 (i.e., 2¹⁰) for units such as bytes or bits, they must use
the binary prefixes of ?, §4, reproduced in Table 3b (these originated in
?).
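
As a worked illustration of the difference between the two families of prefixes, the following Python sketch computes the multiplier implied by a prefix. The dictionaries are abbreviated and the function name `multiplier` is an assumption of this sketch.

```python
# Abbreviated excerpts of Table 3a (decimal) and Table 3b (binary).
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def multiplier(prefix: str) -> int:
    """Return the factor a prefix applies to its base unit."""
    if prefix in BINARY:
        return BINARY[prefix]
    # Decimal prefixes stay powers of ten even on bit or byte.
    return DECIMAL[prefix]
```

Under this reading a `kbyte` is 1000 bytes and a `Kibyte` is 1024 bytes; the gap grows with the prefix, so a `Mibyte` exceeds a `Mbyte` by 48576 bytes.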

Note 1: the ‘s’ and ‘b’ annotations in Table 2 on the preceding page are
not symmetric: the ‘s’ annotation indicates that SI prefixes are permitted in
the given syntax, which means that they are also recognised when preceding
unknown units (which have no restrictions on them); in contrast, binary
prefixes are recognised exclusively on units with a ‘b’ annotation, which means
that they are not recognised with unknown units. That is, the Mifurlong is
the mega-ifurlong (because it starts with the decimal ‘M’ prefix) and the
Kifurlong is the unknown unit Kifurlong (because ‘K’ is not a decimal
prefix, and the binary prefix ‘Ki’ is not recognised when prefixed to an unknown
unit).
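
The asymmetry described in Note 1 can be sketched as follows. This Python fragment is illustrative only: `split_prefix` is a hypothetical name, the set of units accepting binary prefixes is taken from the ‘b’ annotations in the VOUnits column of Table 2, the `da` prefix is omitted, and the priority of known units over prefixes (Sect. 2.3) is not modelled.

```python
from typing import Optional, Tuple

BINARY_PREFIXES = ("Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi")
BINARY_UNITS = {"bit", "byte", "B"}        # units carrying a 'b' annotation
SI_PREFIXES = set("dhkMGTPEZYcmunpfazy")   # single-character prefixes only


def split_prefix(symbol: str) -> Tuple[Optional[str], str]:
    """Split a symbol into (prefix, base), or (None, symbol) if no prefix applies."""
    # Binary prefixes are recognised only in front of units that permit them.
    for prefix in BINARY_PREFIXES:
        if symbol.startswith(prefix) and symbol[len(prefix):] in BINARY_UNITS:
            return prefix, symbol[len(prefix):]
    # Decimal prefixes are recognised in front of anything, known or not.
    if len(symbol) > 1 and symbol[0] in SI_PREFIXES:
        return symbol[0], symbol[1:]
    return None, symbol
```

With this function `Kibyte` splits into `Ki` and `byte`, `Mifurlong` into `M` and `ifurlong`, and `Kifurlong` is returned whole as an unknown unit.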

Note 2: The letter u is used instead of the µ symbol to represent a factor
of 10⁻⁶, following the character set defined in Sect. 2.1.

Note 3: The FITS deprecations in Table 2 on the previous page are those
where that standard’s Table 4 notes that the unit is “Deprecated in IAU Style
Manual but still in use.” The VOUnits deprecations inherit these, and add
(by community acclaim) the besselian and tropical years.

Note 4: The unit names ‘angstrom’ and ‘ohm’ are correctly spelled with
a lowercase initial letter (as required by ?), but in both cases their usual unit
symbol is a non-ASCII character, so that their all-ASCII version here must be
a word. Following the lead of the FITS and CDS standards, and of SI
unit-symbols derived from surnames, and disagreeing with the OGIP standard,
we have preferred the initial-capital symbol in VOUnits (thus ‘Angstrom’ as
the unit symbol, rather than ‘angstrom’).

min    (minute of time)   deg     (degree of angle)   Jy  (Jansky)
h      (hour of time)     arcmin  (arcminute)         pc  (parsec)
d      (day)              arcsec  (arcsecond)         eV  (electron volt)
a, yr  (year)             mas     (milliarcsecond)    AU  (astronomical unit)
u      (atomic mass)

Table 4: Additional astronomy symbols


2.7 Astronomy symbols 

Table 12 on page 30 lists symbols used in astronomy to describe times, angles,
distances and a few additional quantities. The subset of these used by this
specification is listed in Table 4.

Minutes, hours, and days of time must be represented in VOUnits by
the symbols min, h and d; however the cd is the candela, not the centi-day.²
The year may be expressed by yr (common practice), or a, as recommended
by ISO (?, Annex C) and the IAU (?, Table 6). However peta-year must
only be written Pyr, to avoid the collision with the pascal, Pa.

There are no VOUnit symbols for degrees Celsius or century.
Temperatures are expressed in kelvin (K), and a century corresponds to ha or hyr.
Note that this is a mild deviation from the SI standard, which states that
the ‘hectare’, with unit symbol ha, is a ‘non-SI unit accepted for use’ as a
measure of land area (?, table 6), and which acknowledges neither ‘a’ nor
‘yr’ as a symbol for year.³

The astronomical unit should be expressed in upper-case, AU, in order to
follow legacy practice. It may also be written au, in the VOUnits syntax, on
the ground that it would be perverse to prefer the atto-atomic-mass to the
astronomical unit, in an astronomical unit specification. This is a deviation
from the SI recommendation of ‘ua’ (?, Table 7), but conformant with the
IAU’s recommendation of ‘au’ (?).⁴

² We therefore rule out interpreting dB/cd as 0.9 mbit/s.

³ If large telescope arrays feel they must talk of attojoules per hectare per century, for
some reason, they’re going to have to be careful how they do so; it’s probably best not to
even think about atto-Henrys.

⁴ If you feel a burning desire to write about micro-years or atto atomic-mass, this
document is not the place you need to look for help.

Because of the near-degeneracy between the decimal prefixes d and da,
there is an ambiguity when parsing the unit dadu - is this the deca-du or the
deci-adu? The only cases where this ambiguity is possible are those involving
known units starting with ‘a’ (da is unambiguously a deci-year for the same
reason that d is unambiguously a day, because the presence of a bare unit
prefix would be ungrammatical). We can think of no cases where the prefix is
useful enough that resolving the ambiguity is worth the specification effort,
so we deem the parse of da.* to be unspecified. In consequence, data
providers must not use the da prefix, and should not use the d prefix (as
noted in Sect. 2.8, the decibel, dB, is listed as a ‘known unit’, as opposed to
a deci-Bel).
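
The ambiguity can be made concrete by enumerating the candidate readings of a symbol that begins with ‘d’. The Python sketch below is illustrative only, and the function name `candidate_parses` is an assumption; it lists readings and does not choose between them, since the parse is unspecified.

```python
from typing import List, Tuple


def candidate_parses(symbol: str) -> List[Tuple[str, str]]:
    """List the (prefix, base) readings of a symbol under the d and da prefixes."""
    readings = []
    # A prefix must be followed by a base symbol, so a bare prefix is no reading.
    if symbol.startswith("da") and len(symbol) > 2:
        readings.append(("da", symbol[2:]))
    if symbol.startswith("d") and len(symbol) > 1:
        readings.append(("d", symbol[1:]))
    return readings
```

For `dadu` this yields two readings, deca-`du` and deci-`adu`, whereas `da` has only the reading deci-`a` and `d` has none, which matches the reasoning in the paragraph above.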

2.8 Other symbols, and other remarks 

Table 13 on page 31 corresponds to Table 7 in the IAU document, and the
IAU strongly recommends no longer using these units. Data producers are
strongly advised to prefer the equivalent notation using symbols and prefixes 
listed in Tables 10, 11 and 12. 

However, in order to be compatible with legacy metadata, VOUnit parsers 
should be able to interpret symbols angstrom or Angstrom (for angstrom), 
barn, erg and G (for gauss). 

Table 14 on page 32 compares other miscellaneous symbols. The last set 
of VOUnits symbols, derived from this comparison, is in Table 5. 


mag (magnitude)                             pix or pixel
solMass (solar mass)                        R (rayleigh)
Ry (rydberg)                                voxel
solLum (solar luminosity)                   chan (channel)
lyr (light year)                            bit
solRad (solar radius)                       bin
ct or count                                 byte (8 bits)
Sun (relative to the Sun, e.g. abundances)  beam
ph or photon                                adu
D (Debye)                                   unknown (Sect. 2.2)

Table 5: Miscellaneous VOUnits.

A few symbols which might theoretically be ambiguous are listed in
Table 6, with their consensus VOUnit interpretation.

VOUnit   Correct interpretation   Incorrect
Pa       pascal                   peta-year
ha       hecto-year               hectare
cd       candela                  centi-day
dB       decibel                  deci-byte
B        byte                     bel
au       astronomical unit        atto-atomic-mass

Table 6: Possibly ambiguous units

It can be noted that some of the units listed in Table 14 on page 32 are
questionable. They arise in fact from a need to describe quantities, when
the only piece of metadata available is the unit label. Count, photon, pixel,
bin, voxel, bit, byte are concepts, just as apple or banana. The associated
quantities could be fully described with a UCD, a value and a void unit
label. It is possible to count a number of bananas, or to express a distance
measured in bananas, but this does not make a banana a reference unit.

The FITS document provides the most general description of all the 
compared schemes, and VOUnits adopts similar definitions, for the sake of 
legacy metadata. The VOUnits symbol for magnitudes is mag. Note that all 
symbols like count, photon, pixel are always used in lower case and singular 
form. 

The decibel, dB, is listed in the SI specification (?, Table 8) amongst 
a set of ‘other non-SI units’, and mentioned by ?, §0.5 in a ‘Remark on 
logarithmic quantities’. The dB is included in the list of ‘known units’ of 
Table 2 on page 14 and so must be parsed as a unit by itself - as opposed to 
being parsed as the prefix ‘d’ qualifying the unit ‘Bel’ - and both the decibel 
and Bel must not be used with other scaling prefixes. 

If there is no unit associated with a quantity (for example a quantity 
that is a character string, or unitless), data providers should indicate this 
with an empty string rather than blanks or dashes. 

2.9 Mathematical expressions containing symbols 

Table 15 on page 33 summarizes how, in the various existing syntaxes, 
mathematical operations may be applied on unit symbols for exponentiation, 
multiplication, division, and other computations. 

The combination rules are where the largest discrepancies between the 
different schemes appear. The FITS document discusses the problem of 
trying to best accommodate the existing schemes (?, §4.3.1), without really 
resolving the problem. This and other ambiguities are discussed in the 
detailed syntaxes of Appx. C. 

VOUnits follow a subset of the FITS rules, as summarized in Table 7. 

str1.str2     Multiplication
str1/str2     Division
str1**expr    Raised to the power expr
fn(str1)      Function applied to a unit string


Table 7: Combination rules and mathematical expressions for VOUnits. See 
Appx. C.4 for the complete grammar. 


log(str1)     Common Logarithm (to base 10)
ln(str1)      Natural Logarithm
exp(str1)     Exponential (e^str1)
sqrt(str1)    Square root


Table 8: Functions of units. 

As illustrated in Table 7, units may include a limited set of functional 
dependencies on other units. The set of functions recognised within VOUnits 
is the same as the set recommended by FITS, and listed in Table 8. As with 
unrecognised units, parsers should accept unrecognised functions without 
error, even if they deprecate them at some later processing stage. As described 
in Sect. 2.11, functions may be quoted to indicate that they must not be 
interpreted as in this table. Note that since functions such as ‘log’ require 
dimensionless arguments, when a quantity x is (for example) represented by 
numbers labelled with units log(Hz), that indicates that the numbers are 
related to x by the function log(x/(1 Hz)). 
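
The log(Hz) convention can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names to_log_hz and from_log_hz are hypothetical and are not part of this Recommendation or of any library.

```python
import math

def to_log_hz(frequency_hz: float) -> float:
    """Number to store in a column labelled log(Hz), i.e. log10(x / (1 Hz))."""
    if frequency_hz <= 0:
        raise ValueError("log(Hz) is only defined for positive frequencies")
    # Dividing by 1 Hz makes the argument of the logarithm dimensionless.
    return math.log10(frequency_hz / 1.0)

def from_log_hz(value: float) -> float:
    """Recover the frequency in Hz from a number labelled log(Hz)."""
    return 10.0 ** value

stored = to_log_hz(1.4e9)        # value written to the log(Hz) column
recovered = from_log_hz(stored)  # frequency in Hz again
```

The same pattern applies to any unit inside log(...): the stored number is the logarithm of the quantity divided by one of that unit.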

2.10 The numerical scale-factor 

A VOUnits unit string may start with a numerical scale-factor to indicate 
a derived unit. For example, the inch might appear as the unit of 25.4mm. 
See Appx. C.4 for the syntax of the VOUnits numerical string. 

A data provider may choose to use such a unit in order to represent a 
unit which is not listed as one of the VOUnit ‘known units’. For example, 
given a VOTable column of masses relative to Jupiter’s mass, one might label 
it as having units of 1.898E27kg rather than 'jupiterMass' (an ‘unknown 
unit’). The advantage of doing so is that the data consumer can translate 
the column data into well-known physical units without further information, 
and the data source is thus self-contained. The disadvantage of doing so 
is (i) that the intention might be obscured (this is a type of provenance 
information); and (ii) that the measurements may be relative to (in this 
example) the actual mass of Jupiter rather than merely expressed in those 
terms, so that the measurements should change if the actual mass were to be 
refined as a result of a recalibration, or if (in the case of a pulsar period for 
example) the unit were time-varying. The data provider retains the choice 
of which strategy to take. 

A data provider may need to provide further metadata information, to 
clarify the meaning of such a unit, or they may judge that the meaning 
is adequately clear to the intended audience, without further complication. 
Such further information is out of scope of the VOUnits Recommendation, 
in the same way that even ‘known’ units may be ambiguous in some contexts 
(cf. Sect. 2.4). 

This Recommendation does not prescribe how many significant figures 
should be in a scale-factor, nor whether it should be interpreted as single- 
or double-precision, nor how units with scale-factors should be compared for 
equality. All of these are implementation choices for the software which is 
handling the units. 
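
As an illustration of how software might separate a leading scale-factor from the rest of a unit string, the following Python sketch uses a deliberately simplified regular expression. The pattern and the function name split_scale_factor are assumptions made for this example; the normative syntax of the numerical string is the grammar of Appx. C.4, which this sketch does not reproduce.

```python
import re

# Simplified scale-factor pattern (an assumption for illustration only):
# digits, an optional fractional part, and an optional exponent.
_SCALE_RE = re.compile(r"^([0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?)(.*)$")

def split_scale_factor(unit: str) -> tuple[float, str]:
    """Split a unit string into (scale_factor, remainder).

    A string without a leading number gets a scale-factor of 1.0.
    """
    match = _SCALE_RE.match(unit)
    if match is None:
        return 1.0, unit
    return float(match.group(1)), match.group(2)

inch = split_scale_factor("25.4mm")          # (25.4, "mm")
jupiter = split_scale_factor("1.898E27kg")   # (1.898e27, "kg")
```

How the resulting floating-point factors are compared for equality (exactly, or within a tolerance) remains an implementation choice, as stated above.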

2.11 Quoting unknown units 

The VOUnits syntax permits the use of ‘unknown units’ (that is, units not 
listed in Table 2 on page 14). There need be no syntactic indication that a 
unit is ‘unknown’; this is convenient, but creates some minor ambiguities. 

In the VOUnits syntax, base symbols may be put between single quotes 
'...' (a significant divergence from the other syntaxes). Such symbols must 
be parsed as unrecognised unit symbols which are not further examined. 

This has two consequences. Firstly, it means that an unknown symbol 
which happens to start with an SI prefix is not broken into a base symbol 
and prefix: thus 'furlong' is parsed as expected, whereas furlong would be 
the femto-‘urlong’. Secondly, a quoted symbol is parsed as an unrecognised 
unit, even if it would otherwise indicate a known unit; thus the unit 'm' is 
parsed as an unknown unit ‘m’, and does not indicate the metre. 

This facility means that a data provider may label data with units of, 
for example, 'martianDay' or the 'B', while still remaining conformant with 
the VOUnits Recommendation, and without risking the leading m being 
misparsed as an SI prefix, or the ‘B’ being misparsed as a ‘byte’. 

Quoted units can take prefixes (they are ‘unknown units’, so there are no 
restrictions on their usage), so that m'furlong' is a milli-furlong, and m'm' 
is a milli-‘m’. The only permissible prefixes are those of Table 3. 
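
The effect of quoting on prefix handling can be sketched in Python as follows. This is a simplified illustration, not the normative grammar: the function parse_symbol is hypothetical, KNOWN_UNITS is a small assumed subset of Table 2, and PREFIXES holds only single-letter SI prefixes (the binary prefixes of Table 3 and the two-letter prefix da, whose parse is unspecified, are left out).

```python
import re

# Illustrative subsets only; the normative lists are Tables 2 and 3.
KNOWN_UNITS = {"m", "s", "g", "Pa", "cd", "dB", "B", "au", "mag", "Jy", "pc"}
PREFIXES = set("dcmunpfazyhkMGTPEZY")

# An optional run of letters (the prefix) followed by a quoted symbol.
_QUOTED = re.compile(r"^([A-Za-z]*)'([^']+)'$")

def parse_symbol(symbol):
    """Return (prefix, base, quoted) for a single unit symbol."""
    match = _QUOTED.match(symbol)
    if match:
        prefix, base = match.groups()
        if prefix and prefix not in PREFIXES:
            raise ValueError(f"unsupported prefix {prefix!r}")
        # A quoted base is never examined further, even if it looks known.
        return prefix, base, True
    if symbol in KNOWN_UNITS:
        return "", symbol, False
    if len(symbol) > 1 and symbol[0] in PREFIXES:
        # An unknown, unquoted symbol starting with a prefix letter is split.
        return symbol[0], symbol[1:], False
    return "", symbol, False
```

With these assumptions, parse_symbol("furlong") yields the femto-‘urlong’ reading, while parse_symbol("'furlong'") and parse_symbol("m'furlong'") keep ‘furlong’ intact as an unknown unit.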

2.12 General rationale (informative) 

2.12.1 Deviations from other syntaxes 

The aspiration of the VOUnits work was that the syntax should be as much 
as possible in the intersection of the various pre-existing syntaxes, so that 
a unit string which conformed to the VOUnits syntax would be parsable in 
each of those other syntaxes. This has not been possible in fact, for four 
reasons. 

1. The CDS syntax permits only a dot to indicate a product, and the 
OGIP syntax only a star, while FITS permits both. The VOUnits 
syntax uses a dot, so that non-trivial OGIP unit strings are therefore 
necessarily invalid VOUnits strings in this one respect. 

2. The VOUnits syntax permits (but does not require) a scale-factor at 
the beginning of the string, which is not a power of 10. Only the CDS 
syntax permits a similar factor. See Sect. 2.10 for discussion. 

3. Only the VOUnits syntax permits quoted units. 

4. Only the VOUnits syntax permits the use of the binary prefixes of 
Table 3. 

The first is both unavoidable in specification, and largely unavoidable in 
practice; the others are VOUnit extensions which a data provider may of 
course decline to take advantage of. 

The scale-factor and quoted-units extensions are intended to support the 
case where the data provider wishes to distribute data including a unit which 
is ‘unknown’, but which the provider nonetheless feels is necessary or useful; 
this should be done only after weighing the considerations of Sects. 2.10 
and 2.11. For the sake of consistency, and in order to allow constructions 
such as M’jupiterMass’, the grammar permits quoted units to take scaling 
prefixes; this is not often likely to be a good idea. 

A VOUnits string which avoids the three extensions above will be parsable, 
with the same meaning, in the CDS and FITS syntaxes, and will be parsable 
by an OGIP parser if dots are replaced by stars. 

2.12.2 Restrictions to ASCII 

As described above, VOUnit unit strings are restricted to printable ASCII 
characters. While the two most prominent uses of these strings will be within 
VOTable attributes (unit="...") and in XML serialisations of a data model 
(for example <unit>...</unit>), we also intend them to be usable within FITS 
files and within databases. Neither of the latter two contexts is necessarily 
unicode-friendly, so permitting non-ASCII characters in a unit string (such 
as Å or µ) is more likely than not to cause trouble. 

Similarly, forbidding spaces within VOUnit strings removes one (minor) 
complication when recognising them in use. 

2.12.3 Other units, and unit-like expressions 

As noted above, the VOUnits syntax does not include structures such as 
arrays or tuples of numbers. We include in this category sexagesimal 
coordinates, calendar dates (in ISO-8601 form or otherwise), RA-Dec pairs, 
and other structured quantities serialised as strings. Each of these is 
well-specified elsewhere, and would require a separate parser if encountered in 
data. Existing VO standards already recommend that coordinates be expressed 
in decimal degrees. 

Quantities like the Modified Julian Date (MJD) are also not recognized 
VOUnits. As described in Sect. 1.2, the quantity MJD can be seen as a 
concept (described by the appropriate UCD or utype), and the corresponding 
value will most likely be expressed in days, so the VOUnit will be d. There 
is no need to overload VOUnits to incorporate the description of concepts 
themselves. 

The notion of unit conversion and quantity manipulation is discussed in 
Sect. 3.3. 

3 Use cases and applications (informative) 

3.1 Unit parsing 

The rules defined in Sect. 2 allow us to build VOUnit parsers. Several services 
can be built on top of a VOUnit parser: 

1. Validation. A service checking that a VOUnit is well written. The 
output of such a service can have different levels: fully valid unit; valid 
syntax, but not the preferred one (e.g., use of deprecated symbols); 
parsing error. 

2. Explanation. A service returning a plain-text explanation of the unit 
label. 

3. Typesetting. A service returning an equivalent of the unit label suitable 
for inclusion in a LaTeX or HTML document. 

4. Dimensional equation. As described by ?, VOUnits can be translated 
into a dimensional equation, allowing conversion methods to be built up 
from one string representation to another (see also Sect. 3.3). 

3.2 Libraries 

There are a few existing libraries able to interpret unit labels. In all cases, 
some software effort is required if they are to be used in translating between 
data provider unit labels, and those to be adopted by the IVOA for internal 
use. 

One of the most widely-used specialised astronomical libraries is AST 
which includes a unit conversion facility attached to astronomical coordinate 
systems (?). 

Another library has been developed at CDS 
(http://cds.u-strasbg.fr/resources/doku.php?id=units), and can be tested 
online (http://cdsweb.u-strasbg.fr/cgi-bin/Unit). This library covers all the 
symbols and notations defined in the standard for astronomical catalogues 
(?, §3.2), as well as additional symbols and notations. 

The Unity library (https://bitbucket.org/nxg/unity) is a new standalone 
library intended to parse unit strings in the VOUnits, OGIP, StdCats and 
FITS syntaxes; it was used as a vehicle for developing and testing the 
grammars and ideas for this present document. It provides yacc-style 
grammars for the various syntaxes, as well as implementing them in parsers 
written in Java and C. The grammars of Appx. C are extracted from the 
Unity distribution. 

3.3 Unit conversion and quantity transformation 

Unit conversion is the simple task of converting a quantity expressed in a 
given unit into a different unit, while the concept remains the same. For 
example, such a library might be able to convert a distance in pc into a 
distance in AU or km, or convert a flux from mJy to W.m-2.Hz-1. This is 
rather easy with existing libraries, using dimensional analysis or SI units as 
a reference. 
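
A minimal Python sketch of conversion through an SI reference is given below. It is illustrative only: the table SI_FACTORS and the function convert are hypothetical, unit strings are looked up whole rather than parsed, and dimensions are reduced to exponents of (length, mass, time).

```python
# Each entry maps a unit label to (dimension exponents, factor to SI).
# Dimension exponents are (length, mass, time).
SI_FACTORS = {
    "m":          ((1, 0, 0), 1.0),
    "km":         ((1, 0, 0), 1.0e3),
    "AU":         ((1, 0, 0), 1.495978707e11),         # metres
    "pc":         ((1, 0, 0), 3.0856775814913673e16),  # metres
    "W.m-2.Hz-1": ((0, 1, -2), 1.0),                   # kg.s-2 in base units
    "Jy":         ((0, 1, -2), 1.0e-26),
    "mJy":        ((0, 1, -2), 1.0e-29),
}

def convert(value, src, dst):
    """Convert value from unit src to unit dst via the SI reference unit."""
    src_dim, src_factor = SI_FACTORS[src]
    dst_dim, dst_factor = SI_FACTORS[dst]
    if src_dim != dst_dim:
        # Dimensional analysis: only units of equal dimension are convertible.
        raise ValueError(f"cannot convert {src} to {dst}: dimensions differ")
    return value * src_factor / dst_factor

flux_si = convert(2.5, "mJy", "W.m-2.Hz-1")
distance_au = convert(1.0, "pc", "AU")
```

The dimension check is what prevents, for example, a flux being silently converted into a length.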

Quantity transformation consists in deriving a new quantity from one or 
several original quantities. It is more complex, because it requires having a 
precise model (a simple equation in simple cases) for computing the 
transformation. The model involves quantities, each described with a UCD or 
utype, value and VOUnit. Some of the quantities involved might be physical 
constants (e.g., Boltzmann’s constant k_B). 
Examples of such transformations can be: 

• linear unit conversion: a distance is measured in pixel in an image, 
and needs to be transformed in the corresponding angular separation 
in arcsec. This can be done if the quantity representing the pixel scale 
is given, with its value and a compatible unit like deg/pixel. 

• converting a photon wavelength in the corresponding photon energy or 
frequency. 

• deriving the flux for a given photon emission rate (in W) from Planck’s 
constant (6.63 x 10^-34 J s), the radiation frequency (in GHz), and the 
number of photons emitted per second. 

• transforming a magnitude into a flux, as needed for SED building. 

VOUnits can help in quantity transformation if all quantities are qualified 
with proper VOUnits. 
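
Two of the transformations listed above can be sketched in Python once every input is qualified with a unit. The function names and the choice of input units below are assumptions made for this illustration; the physical relations used are E = h c / λ for a single photon and P = N h ν for an emission rate of N photons per second.

```python
PLANCK_J_S = 6.62607015e-34      # Planck's constant, J.s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light, m/s

def photon_energy_j(wavelength_m):
    """Photon energy in J from a wavelength given in m."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m

def radiated_power_w(photons_per_s, frequency_ghz):
    """Power in W for a photon emission rate (s-1) at a frequency in GHz."""
    frequency_hz = frequency_ghz * 1.0e9  # GHz to Hz before applying the model
    return photons_per_s * PLANCK_J_S * frequency_hz

energy = photon_energy_j(500e-9)      # a 500 nm photon
power = radiated_power_w(1e20, 1.0)   # 1e20 photons per second at 1 GHz
```

The explicit GHz to Hz step is exactly the kind of conversion that a unit label on the input makes safe to automate.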

3.4 Query languages 

Including VOUnits in queries is not an easy task. Some guidelines were 
articulated during the development of the ADQL standard. 

1. All data providers should be encouraged to supply units for each 
column of a table. Columns should also have associated UCDs, so that 
quantities can be properly identified. 

2. The IVOA needs to provide a parser to relate the native units to the 
standard IVOA labels (in this context, the ‘native units’ are the units 
of the underlying database table or metadata). 

3. The default response to a query which does not specify units, will be 
in the native units of the table. 

4. Where queries involve combining or otherwise operating on the content 
of columns to produce an output column with modified units, we can 
provide libraries and a parser to assist in assigning and checking a 
new unit, and attach this to the returned values via the SQL CAST 
operator. This is implemented already in database related applications 
such as Saada (http://saada.unistra.fr/), for instance. If any column used 
in responding to a query lacks a necessary unit, the output involving that 
column will be unitless. 

5. If the user wants to change the output units with respect to the table 
units, this could be done by specifying the units in the initial SELECT 
statement. There are several issues to consider: 

(a) Does the user also need to include the conversion expression, or 
does the unit parser take care of that? 

(b) Can the user use this to assign units (based on prior knowledge) 
to output from a column lacking a unit? 

3.5 Broader use in the VO 

Different VO entities, such as registries and applications, require and consume 
metadata with units attached, and interoperate via protocols. Fig. 2 illustrates 
the places where the IVOA could intervene to ensure consistent use of units. 

[Figure 2: diagram with the elements ‘Astronomer’, ‘Application (uses units)’, 
‘Application’ and ‘Publisher’, the label ‘Results (in units)’, and the 
annotations ‘Standardise unit labels at this stage?’ and ‘Convert units at 
this stage?’] 

Figure 2: This shows the levels at which conversions might be done. Plain 
arrows: At the point where an astronomer or data provider submits input to 
the VO, we should provide tools to ensure that units are labeled consistently 
according to VOUnits. This implies that a units parsing step is included prior 
to metadata ingestion into the VO. Dashed arrows: Conversions required to 
supply results to the user in specified or user-preferred units, e.g., J.s-1 to W, 
are done where and when they are required. 

A Current use of units (informative) 

Many other projects have already produced lists of preferred representations 
of units. Those most commonly used in astronomy are described in this 
section. 

The first four schemes described below are used as references for the 
comparison tables presented later in this document. 

A.1 IAU 1989 

In Section 5.1 of its Style Manual, the IAU gives a set of recommendations 
for representing units in publications (?). This document therefore provides 
useful reference guidelines, but is not directly applicable to VOUnits because 
the recommendations are intended more for correct typesetting in journals 
than for standardized metadata exchange. The IAU style will be summarized 
in the second column of the comparison tables. 

A.2 OGIP 1993 

NASA has defined a list of character strings specifying the basic physical 
units used within OGIP (Office of Guest Investigator Programs) FITS files 
(?). Rules and guidelines on the construction of compound units are also 
outlined. 

HEASARC datasets follow these conventions, presented in the third 
column of the comparison tables. 

A.3 Standards for astronomical catalogues 

The conventions adopted at CDS are summarized in the Standards for 
Astronomical Catalogues, Version 2.0 (?, §3.2). They are presented in the fourth 
column of the comparison tables. 

A.4 FITS 2010 

In Section 4.3 of the reference FITS paper, ? describe how unit strings are 
to be expressed in FITS files. The recommendations are presented in the 
fifth column of the comparison tables. 

A.5 Other usages 

http://arxiv.org/pdf/astro-ph/0511616 Dimensional Analysis applied 
    to spectrum handling in VO context (?) offers a mathematical 
    framework to guess and recompute SI units for any quantity in astronomy. 

http://unitsml.nist.gov The NIST (National Institute of Standards & 
    Technology) project UnitsML builds up an XML representation of 
    units at the granularity level of a simple symbol string. 

https://www.jcp.org/en/jsr/detail?id=275 JAVA JSR-275 specifies Java 
    packages for the programmatic handling of physical quantities and their 
    expression as numbers of units. 

aips++ and casacore These systems (see 
    http://aips2.nrao.edu/docs/aips++.html and 
    http://code.google.com/p/casacore/) contain modules handling units 
    and quantities with high precision. The packages are mainly in use for 
    radio astronomy but are designed to be modular and adaptable (NB: 
    contrary to the statement on the casacore link, aips++ is still very 
    much in use as the toolkit behind the CASA package). 

B History: Comparison of syntaxes (informative) 

In this section, we compare the pre-existing unit-string syntaxes and the 
current standard described in this document. We have included these 
comparisons for more-or-less historical reasons, to try to highlight the variations 
between syntaxes, and so illustrate the motivation for this Recommendation, 
namely that the current practice, though it may at first appear to have rough 
consensus, is disturbingly heterogeneous. 



                             IAU   OGIP   StdCats     FITS         VOU
Units are strings of chars         YES                YES          YES
Case sensitive               YES   YES    YES         YES          YES
Character set                             No spaces   ASCII text   ASCII printable


Table 9: Comparison of string representation and encoding. 



IAU 

OGIP 

StdCats 

FITS 

VOU 

The 6+1 base 

SI units (use s, 
not sec, for 
seconds) 

m, s, A, K, mol, cd 

(1) 

kg 

g 

kg, but 

g 

allowed 

g 

Dimensionless 
planar and 
solid angle 

rad, sr 

rad, sr, 
deg (2) 

rad, sr 

Derived units 
with symbols 

Ω 

Hz, N 

S, F, 

ohm 

Pa, J, W, 
Wb, T, H, 

Ohm 

C, V, 

lm, lx 

Ohm 

Ohm 


Table 10: Comparison of base units. Notes: (1) unit is kg, but use g with 
prefixes; (2) deg preferred for decimal angles 

IAU 

OGIP 

StdCats 
sec. 3.2.3 

FITS 

VOU 

Scale factors, 

(multiple) 

prefixes 

b 

d, c 

da, h. 

m, n, p, f, a 
k, M, G, T, P, E 

u 

z, y, Z, Y 

u 

z, y, 

Z, Y 

Prefix-symbol 

concatenation 

(1) 

(2) 

no 

space 

no 

space 

(im¬ 

plicit) 

no 

space 

Prefix-able 
symbols 

Not kg: 
use g 

( 3 ) 

all 

all 

( 4 ) 

Use compound 

should 

should 

must 

must 

must 

prefixes 

not 

never 

not 

not 

not 


Table 11: Comparison of scale-factors. Notes: (1) no space, regarded as 
single symbol; (2) no space, regarded as a single unit string; (3) all units 
above, and eV, pc, Jy, Crab (only mCrab allowed); (4) all (except P for a). 

                    IAU                               OGIP     StdCats   FITS        VOU
minute              min                               min      min       min         min
hour                h                                 h        h         h           h
day                 d                                 d        d         d           d
year                a                                 yr       a, yr     a, yr (1)   like FITS
arcsecond           "                                 arcsec   arcsec    arcsec      arcsec
arcminute           '                                 arcmin   arcmin    arcmin      arcmin
degree (angle)      °                                 deg      deg       deg         deg
milliarcsecond      mas (use nrad!)                            mas       mas         mas
microarcsec                                                    uarcsec               (2)
cycle               c (5)                                                            not used
astronomical unit   au                                AU       AU        AU          AU
parsec              pc                                pc       pc        pc          pc
atomic mass         u                                                    u           u
electron volt       eV                                eV       eV        eV          eV
jansky              Jy                                Jy       Jy        Jy          Jy
Celsius degree      °C for meteorology, otherwise K                                  not used
century             (3)                                                              (4)


Table 12: Comparison of astronomy-related units. Notes: (1) Pa (peta-a) 
forbidden; (2) no dedicated symbol, use uarcsec; (3) ha, cy should not be 
used; (4) no dedicated symbol, use ha or hyr; (5) superscript-‘c’ has also 
been used to denote ‘radian’ 

                       IAU                  OGIP       StdCats   FITS       VOU
angstrom               Å                    angstrom   0.1nm     Angstrom   angstrom, Angstrom
micron                 µ                                                    not used
fermi                  no symbol                                            not used
barn                   b                    barn       barn      barn       barn
cubic centimetre       cc                                                   no dedicated symbol
dyne                   dyn                                                  not used
erg                    erg                  erg        (1)       erg        erg
calorie                cal                                                  not used
bar                    bar                                                  not used
atmosphere             atm                                                  not used
gal                    Gal                                                  not used
eotvos                 E                                                    not used
gauss                  G                    G                    G          G
gamma                  γ                                                    not used
oersted                Oe                                                   not used
Imperial, non-metric   should not be used                                   not used


Table 13: Comparison of symbols deprecated by IAU (from ? table 7, 
“Non-SI units and symbols whose continued use is deprecated”). Note: (1) no 
symbol: mW/m2 is used for erg cm^-2 s^-1. 

                          IAU   OGIP           StdCats   FITS         VOU
magnitude                 mag   mag            mag       mag          same as FITS
rydberg                                        Ry        Ry           same as FITS
solar mass                M☉                   solMass   solMass      same as FITS
solar luminosity                               solLum    solLum       same as FITS
solar radius                                   solRad    solRad       same as FITS
light year                      lyr                      lyr          same as FITS
count                           count          ct        ct, count    same as FITS
photon                          photon                   photon, ph   same as FITS
rayleigh                                                 R            same as FITS
pixel                           pixel          pix       pix, pixel   same as FITS
debye                                          D         D            same as FITS
relative to Sun                                Sun       Sun          same as FITS
channel                         chan                     chan         same as FITS
bin                             bin                      bin          same as FITS
voxel                           voxel                    voxel        same as FITS
bit                                            bit       bit          same as FITS
byte                            byte           byte      byte         same as FITS
adu                                                      adu          same as FITS
beam                                                     beam         same as FITS
Crab                            avoid use                             not used
No unit, dimensionless          blank string   -                      empty string
Percent                                        %                      1
unknown                         UNKNOWN                               unknown


Table 14: Miscellaneous other symbols. 

                            IAU                OGIP                StdCats       FITS                     VOU
Multiplication              space or dot (1)   space or star (2)   dot           space, dot or star (3)   dot
Division                    per (4)            / (5)               /, no space   /, no space              /, no space
Use of multiple /           never              allowed             allowed       discouraged (6)          never
sym raised to the power y   superscript        (7)                 (8)           (9)                      **
Exponential of sym                             exp(sym)                          exp(sym)                 exp(sym)
Natural log of sym                             ln(sym)                           ln(sym)                  ln(sym)
Decimal log of sym                             log(sym)            [sym]         log(sym)                 log(sym)
Square root of sym                             sqrt(sym)                         sqrt(sym)                sqrt(sym)
Other math                                     (10)                              not used                 not used
()                                             allowed             allowed       optional around powers   allowed
powers                      superscripts       (11)                integers      (12)                     (12)
Numeric factor              not used           (13)                allowed       (14)                     (15)


Table 15: Mathematical expressions and symbol combinations. This table 
is derived from the specification texts; any deviation from the grammars of 
Appx. C.1-C.3 is unintentional. This table, and those appendices, are 
informative; only Appx. C.4 is normative. Notes: (1) space, except if previous 
unit ends with superscript; dot (.) may be used; (2) one or more spaces 
OR one asterisk (*) with optional spaces on either side; (3) single space 
OR asterisk (*, no spaces) OR dot (., no spaces); (4) use negative index 
or solidus (/); (5) solidus (/) with optional spaces on either side, space not 
recommended after / OR negative index; (6) may be used, but discouraged, 
‘math precedence rule’; (7) sym**(y), parenthesis optional if y > 0; 
(8) nothing - symy, and use sign for 10+21; (9) symy OR sym**(y) OR sym^(y), 
no space; (10) f(sym), where f is sin, cos, tan, asin, acos, atan, sinh, cosh, 
tanh; (11) decimal and integer fractions allowed; (12) integer (sign and () 
optional), OR decimal or ratio between (); (13) should be avoided; only powers 
of 10 allowed; should precede any unit string; (14) optional 10**k, 10^k, or 
10±k; (15) see Sect. 2.10. 

C Formal grammars 


Subsection C.4 is Normative, the other subsections are Informative. 

In this section we provide formal (yacc-style) grammars for the four 
ASCII-based syntaxes discussed in this document. The FITS, OGIP and 
CDS grammars are not normative: the corresponding specification 
documents do not provide grammars, and instead describe the syntaxes in text, so 
that the grammars here are deductions from the specification text. This 
unfortunately means that some of these syntaxes were discovered to be 
ambiguous. These ambiguities are discussed in the sections below. We recommend 
that VO applications parse these syntaxes in a way which is consistent with 
the grammars here. The grammar for the VOUnits syntax, in Appx. C.4, is 
normative. 

We believe that the grammars below are such that if a string successfully 
parses in two distinct grammars, it means the same in both. 

The grammars here are from release 1.0 of the ‘Unity’ package at 
https://bitbucket.org/nxg/unity, which includes machine-readable grammars, 
lists of recommended units, and a collection of test cases. These are also 
extracted in machine-readable form at 
https://code.google.com/p/volute/source/browse/trunk/projects/std-vounits/unity-grammars.zip. 

In these grammars, the common terminals are as given in Table 16. 
Lexers must not swallow whitespace in generating these terminals; whitespace 
is permitted in a units string only where the corresponding grammar permits 
the WHITESPACE terminal. 


CARET               the ^ character (5E_16)
DIVISION            the solidus, / (2F_16)
DOT                 the dot/period/full-stop character (2E_16)
FLOAT               a string matching the regular expression
                    [-+]?[0-9]+\.[0-9]+
LIT10               a literal string ‘10’ (the sequence 31_16 30_16)
OPEN_P / CLOSE_P    parentheses (28_16 and 29_16)
SIGNED_INTEGER      an integer with a required leading sign, so matching
                    the regular expression [-+][0-9]+
STAR                the asterisk (2A_16)
STARSTAR            a pair of asterisks, **
STRING              a non-empty sequence of letters [a-zA-Z]+
UNSIGNED_INTEGER    an integer with no leading sign [0-9]+
WHITESPACE          a non-empty string of space characters (20_16 only)


Table 16: The terminals used in the grammars; the notation NN_16 indicates 
hexadecimal ASCII character numbers; the digits are 30_16 to 39_16, the letters 
are 41_16 to 5A_16 and 61_16 to 7A_16, and the sign characters are 2B_16 and 2D_16. 
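
The terminals of Table 16 translate fairly directly into a regular-expression lexer. The Python sketch below is illustrative and is not taken from the Unity package: the function tokenize and the ordering of alternatives are assumptions of this example, and LIT10 is omitted because distinguishing it from UNSIGNED_INTEGER depends on grammar context. Whitespace is emitted as a token rather than skipped, in line with the rule that lexers must not swallow it.

```python
import re

# One named group per terminal. Order matters: FLOAT is tried before the
# integer terminals, and STARSTAR before STAR.
TOKEN_RE = re.compile(r"""
    (?P<FLOAT>[-+]?[0-9]+\.[0-9]+)
  | (?P<SIGNED_INTEGER>[-+][0-9]+)
  | (?P<UNSIGNED_INTEGER>[0-9]+)
  | (?P<STARSTAR>\*\*)
  | (?P<STAR>\*)
  | (?P<STRING>[a-zA-Z]+)
  | (?P<WHITESPACE>[ ]+)
  | (?P<CARET>\^)
  | (?P<DIVISION>/)
  | (?P<DOT>\.)
  | (?P<OPEN_P>\()
  | (?P<CLOSE_P>\))
""", re.VERBOSE)

def tokenize(text):
    """Return a list of (terminal_name, lexeme) pairs for a unit string."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("kg.m**-2")
```

A parser for any one of the syntaxes would then accept or reject the WHITESPACE tokens according to its own grammar.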

C.1 The FITS grammar (informative) 

For the FITS units syntax, see section 4.3 of ?, and its associated tables. 
Our preferred FITS grammar is in Table 17 on the following page. 

As noted above in Sect. 2.9, the FITS specification isn’t completely clear 
on the topic of solidi, saying “[t]he IAU style manual forbids the use of more 
than one solidus (/) character in a units string. However, since normal 
mathematical precedence rules apply in this context, more than one solidus 
may be used but is discouraged”. This does not really resolve the question of 
whether, for example, kg/m s should be parsed as kg m^-1 s^-1 or as kg m^-1 s, 
since this is a question of both operator precedence and (left-)associativity, 
where there might be different rules internationally, and conflicts between 
mathematical and programming-language rules. Most people would 
probably parse it as kg m^-1 s^-1, but we trust that most educators would oblige 
students to rewrite the expression on the grounds that any ambiguity is too 
much. Here, we resolve the ambiguity by declaring that there can be only a 
single expression to the right of the solidus. 

It is a consequence of this that nothing can be successfully parsed in two 
different grammars, with different meanings. If the right-hand-side of the 
division could be a product_of_units, then kg /m s would parse in both 
the FITS and OGIP syntaxes, but mean kg m^-1 s^-1 in the FITS syntax, and 
kg m^-1 s in the OGIP one. 

The FITS specification permits a leading numeric multiplier, but “[c]reators 
of FITS files are encouraged to use the numeric multiplier only when the 
available standard scale-factors of [SI] will not suffice”. 

The FITS specification permits m(2), to indicate the square of unit ‘m’. 
The grammar has to special-case this, in order to distinguish it from function 
application. 

Other ambiguities: 

• The FITS specification may or may not be intended to permit 10+3 
/m, but we don’t. 

• It is possible to read the FITS spec as permitting m^1.5, without 
parentheses. We take it to be invalid here. 


Table 17: The FITS grammar. See Appx. C.1. 
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C.2 The OGIP grammar (informative) 

For the OGIP units syntax, see ?. Our preferred OGIP grammar is in Ta¬ 
ble 18. 

The OGIP specification somewhat reluctantly concedes (in its section 
3.2) that “occasionally it may be preferable to include [leading scale] factors 
on the grounds of user-friendliness”, but that “[t]he inclusion of numerical 
factors should therefore be avoided wherever possible”, and it is “suggested” 
that the scale-factor should in any case be restricted to powers of 10. 

Specification ambiguities: 

• The OGIP specification permits a space between the leading factor and 
the rest of the unit (by implication from the provided examples). 

• The specification does not indicate the format of the numerical factor 
in the case where it is not a power of ten. We have suggested FLOAT 
here (see Table 16 on page 34). 

• OGIP recommends having no whitespace after the division solidus, but 
does not forbid it; therefore we permit it in this grammar. 

• From its specification text, OGIP appears to permit str1**y, where 
y can be a float, even though none of its examples include this. The 
same interpretive logic would appear to permit m**3/2, but this seems 
to run too great a risk of being misparsed, and we forbid it here. 

• In the same place, the text suggests that str1**y may omit the
brackets ‘if y is positive’, but the context suggests that the intention
is to permit this if y is unsigned. In the grammar here, we permit the
omission of the brackets only if y is unsigned - that is, m**+2, like
m**-2, is forbidden. 
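
The bracket rule in the last item can be illustrated with a small check on the text that follows the ‘**’ operator. This is only a sketch: the helper name ogip_exponent_ok is hypothetical, and the plain-decimal number format it accepts is an assumption, not something taken from Table 18.

```python
import re

# Assumed number format for this sketch: digits with an optional fraction.
_UNSIGNED = r"[0-9]+(?:\.[0-9]+)?"

# Either an unsigned number, or an optionally signed number in brackets.
_EXPONENT = re.compile(rf"(?:{_UNSIGNED}|\([-+]?{_UNSIGNED}\))")


def ogip_exponent_ok(exponent: str) -> bool:
    """Return True if the text after '**' obeys the bracket rule above.

    Hypothetical helper: it looks only at the exponent, not the whole unit.
    """
    return _EXPONENT.fullmatch(exponent) is not None


if __name__ == "__main__":
    for text in ["2", "1.5", "(-2)", "(+2)", "-2", "+2", "3/2"]:
        print(text, ogip_exponent_ok(text))
```

Under these assumptions, an unsigned exponent such as 2 passes without brackets, a signed one passes only when bracketed, and the unbracketed ratio 3/2 is rejected.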


Table 18: The OGIP grammar. Note that the FLOAT in the scalefactor 
production must be a power of ten. See Appx. C.2. 
C.3 The CDS grammar (informative) 

For the CDS units syntax, see (?, §3.2). Our preferred CDS grammar is in 
Table 19. It requires additional terminals, described in Table 20. 

Specification ambiguities: 

• The CDS document indicates that units should be raised to powers by 
concatenation of the unit string with an integer, but does so rather 
elliptically, so that it is not clear whether m+2 is permitted (the
relevant examples show this as m2). We take this to be permitted in this
grammar. 

• The specification does not indicate the format of the numerical factor 
in the case where it is not a power of ten and not a CDSFLOAT. We have 
suggested FLOAT here (see Table 16 on page 34). 

• The document does not specify or illustrate how kg/m/s should be 
parsed. Since the document mentions the OGIP standard (even though 
it does not permit OGIP’s syntax for powers, m**2), we take it that 
this is valid, and equivalent to kg m⁻¹ s⁻¹. 

This specification places no restrictions on the leading scale-factor. 

Table 19: The CDS grammar. See Appx. C.3 for discussion, and Table 20 
for the additional terminals. 


CDSFLOAT   a string matching the regular expression
           [0-9]+\.[0-9]+x10[-+][0-9]+ (that is, something
           resembling 1.5x10+11)
OPEN_SQ    the open square bracket ‘[’ (indicates logs in this syntax)
CLOSE_SQ   the close square bracket ‘]’
PERCENT    the percent character ‘%’


Table 20: Extra terminals for the CDS grammar 
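
As an illustration of the CDSFLOAT terminal, the following sketch recognises such a string and converts it to a number. The function name parse_cdsfloat is hypothetical, and returning None for a non-matching string is a choice made for this example only.

```python
import re
from typing import Optional

# The CDSFLOAT pattern from Table 20, with groups for mantissa and exponent.
_CDSFLOAT = re.compile(r"([0-9]+\.[0-9]+)x10([-+][0-9]+)")


def parse_cdsfloat(text: str) -> Optional[float]:
    """Return the numeric value of a CDSFLOAT string, or None if no match.

    Hypothetical helper for illustration; not part of any CDS tool.
    """
    match = _CDSFLOAT.fullmatch(text)
    if match is None:
        return None
    mantissa, exponent = match.groups()
    # Reuse Python's own float parser by rewriting 'x10' as 'e'.
    return float(f"{mantissa}e{exponent}")


if __name__ == "__main__":
    print(parse_cdsfloat("1.5x10+11"))
    print(parse_cdsfloat("1x10+3"))
```

Note that the pattern requires both a decimal point and a signed exponent, so a string such as 1x10+3 is not a CDSFLOAT under this reading.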
C.4 The VOUnits grammar (normative) 

The VOUnits grammar is defined by this section, by the grammar in Table 21 
(with the terminals of Table 16 on page 34 plus the extra terminals listed 
in Table 22), the list of known units of Table 2 on page 14, and the list of 
known functions of Table 8. 

The intention of the VOUnits grammar is that if a VOUnits string does 
not use the scale-factor, quoted-units or binary-prefix extensions (that is, if 
it avoids the VOUFLOAT and QUOTED_STRING terminals and is restricted to SI 
decimal prefixes), then it will be parsable, with the same semantics, by FITS 
and CDS parsers, and that it will be parsable by an OGIP parser if dots are 
replaced by stars. See Sect. 2.12.1 for discussion. In particular: 

• The product of units is indicated only by a dot, with no whitespace: 
N.m. 

• Raising a unit to a power is done only with a double-star: kg.m**2.s**-2. 

• There may be at most one division sign at the top level of an expression. 
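
The dot-to-star substitution mentioned above, for handing a restricted VOUnits string to an OGIP parser, might be sketched as below. The function name vounits_to_ogip is hypothetical, and the rule used to tell a product dot from a decimal point (a dot with a digit on both sides is a decimal point) is an assumption of this sketch. It performs only the textual substitution and does not check that the result is acceptable to any particular OGIP parser.

```python
import re

# Assumption of this sketch: a dot flanked by digits on both sides is a
# decimal point (as in a bracketed power such as (1.5)); every other dot
# is taken to be a product sign.
_PRODUCT_DOT = re.compile(r"(?<![0-9])\.|\.(?![0-9])")


def vounits_to_ogip(unit: str) -> str:
    """Replace product dots with stars, leaving decimal points alone.

    Hypothetical helper for illustration only.
    """
    return _PRODUCT_DOT.sub("*", unit)


if __name__ == "__main__":
    print(vounits_to_ogip("N.m"))
    print(vounits_to_ogip("m**(1.5).s"))
```

A blanket replacement of every dot would also rewrite decimal points inside numbers, which is why the sketch looks at the neighbouring characters.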

In Table 21, the VOUFLOAT terminal is a string matching either of the 
regular expressions 

• 0\.[0-9]+([eE][+-]?[0-9]+)? 

• [1-9][0-9]*(\.[0-9]+)?([eE][+-]?[0-9]+)? 

(that is, something resembling 0.123 or 1.5e+11). 
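
A minimal sketch of a recogniser for the VOUFLOAT terminal follows, combining the two regular expressions above into a single alternation. The name is_voufloat is hypothetical and used here only for illustration.

```python
import re

# The two VOUFLOAT alternatives from the text, joined with '|'.
_VOUFLOAT = re.compile(
    r"0\.[0-9]+([eE][+-]?[0-9]+)?"
    r"|[1-9][0-9]*(\.[0-9]+)?([eE][+-]?[0-9]+)?"
)


def is_voufloat(text: str) -> bool:
    """Return True if the whole string matches the VOUFLOAT pattern."""
    return _VOUFLOAT.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ["0.123", "1.5e+11", "10", "01.5", ".5", "1.", "0"]:
        print(text, is_voufloat(text))
```

Read this way, the pattern rejects a leading zero before further digits, a bare leading or trailing decimal point, and a lone 0, while accepting plain integers such as 10.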


Table 21: The VOUnits grammar. See Appx. C.4 for discussion, and Table 22 
for additional terminals. 

VOUFLOAT       see text, Appx. C.4
QUOTED_STRING  a STRING between single quote marks (ASCII 27₁₆)

Table 22: Extra terminals for the VOUnits grammar 
D Updates of this document (informative) 

The detailed (line-by-line) history of the document can be found at
https://code.google.com/p/volute/source/browse/trunk/projects/std-vounits/VOUnits.tex. 

• 2014 July 2: minor adjustment, to remove ‘proposed recommendation’ 
text which was retained in the final version by accident. 

• REC-1.0-20140523: Approved as REC by Exec, at the IVOA Madrid 
Interop. 

• PR-1.0-20140513: A few rewordings, addressing comments made
during the TCG review period. 

• PR-1.0-20140226: Minor wording and layout changes, following on-list 
discussion. Released for TCG review. 

• PR-1.0-20131224: 

— Grammar changes: minor (now incorporates the grammars of 
Unity v0.11). 

— Various clarifications to the text, following on-list discussion. 

• WD-1.0-20131025: 

— Grammar changes: The ‘%’ character is now treated as a special 
case, rather than being a permitted ’STRING’ character; it’s only 
the CDS syntax that permits this character. Some readability 
adjustments to the grammars. Unit strings with leading slashes 
(eg /mS) are no longer supported in the VOUnits syntax. The 
grammars now match Unity v0.10. 

— Changed discussion/rationale for forbidding non-ASCII
characters. 

— Clarified that ‘?’ - which is specified as indicating an unknown 
unit - is not part of the VOUnits grammar, and should be spotted 
by a caller before parsing begins. 

— Clarified the extra terminals which some grammars use. 

— Clarified that the ambiguity in dadu should remain unresolved, 
and the correct behaviour unspecified (is it deci-adu or deka-du?). 

• WD-1.0-20131011: Changed gramme to gram; removed color property 
to distinguish arrows in fig. 2; removed astro’l unit abbreviation from 
known-units.tex. 

• WD-1.0-20130922: Responding to RFC and mailing list comments. 
Addition of quoted units and arbitrary scale-factor (so updates to 
grammars, which now match Unity v0.9). Some reformatting of
tables. 

• WD-1.0-20130724: Rephrasing and clarification, responding to RFC 
comments. Update unity grammars to current version (ie, version of 
2013-07-22 18:40). 
• WD-1.0-20130701: Simplified Architecture diagram. Added example 
with scientific notation. Adjusted locations of grammar tables to try 
to keep them closer to the associated text. 

• WD-1.0-20130429: Some restructuring, some rephrasing, and a few 
layout changes. 

• WD-1.0-20130225: Large tables from section 3 moved to Appendix A. 
Short summaries of symbols added to section 3. Changes to table of 
known units for consistency with text. Added explanations for units 
Sun and byte. 

• WD-1.0-20121212: Minor typographical fixes. Added definition of 
OGIP. Removed last sentence from acknowledgements, which have 
been moved to the beginning of the document. Changed figure 1 to 
move Units in Semantics. Added ’discouraged’ in first line of Table 7. 
Color change in figure 2 and its label. 

• WD-1.0-20120801: Minor typographical fixes 

• WD-1.0-20120801: 

— Included yacc-style grammars in document. 

• WD-1.0-20120718: 

— Removed external tables refs in tables to avoid confusion. 

— Removed refs to SOFA and NOVAS. 

— Precision on the "no unit" case in text. 

— Added formal grammar in annex. 

— Minor editing and typo fixes. 

• WD-1.0-20120521: 

— Typos fixed, removed F. Bonnarel from authors. 

— One sentence rephrased in section 1.2 for clarity. 

— Clarification of g and kg issue in Sect. 2.3. 

— Added remark on Pa in Sect. 2.6. 

— Micro-arcsecond and century explained in Table 12 on page 30. 

— Table 13 on page 31 completed. 

— Added numeric factors in Table 15 on page 33 and discussion in 
text. 

• WD-1.0-20111216: Major rework of the document. 

• 0.3: initial public release. 